immunoglobulin superfamily (IgSF) domains and derivatives
thereof so as to increase their solubility, and hence the
yield, and ease of handling.

Small antibody fragments show exciting promise for
use as therapeutic agents, diagnostic reagents, and for
biochemical research. Thus, they are needed in large amounts,
and the expression of antibody fragments, e.g. Fv,
single-chain Fv (scFv), or Fab, in the periplasm of E. coli
(Skerra & Plückthun, 1988; Better et al., 1988) is now used
routinely in many laboratories. Expression yields vary
widely, however, especially in the case of scFvs. While some
fragments yield up to several mg of functional, soluble
protein per litre and OD of culture broth in shake flask
culture (Carter et al., 1992; Plückthun et al., 1996), other
fragments may almost exclusively lead to insoluble material,
often found in so-called inclusion bodies. Functional protein
may be obtained from the latter in modest yields by a
laborious and time-consuming refolding process. The factors
influencing antibody expression levels are still only poorly
understood.

Folding efficiency and stability of the antibody
fragments, protease lability and toxicity of the expressed
proteins to the host cells often severely limit actual
production levels, and several attempts have been made to
increase expression yields. For example, Knappik & Plückthun
(1995) have identified key residues in the antibody framework
which influence expression yields dramatically. Similarly,
Ullrich et al. (1995) found that point mutations in the CDRs
can increase the yields in periplasmic antibody fragment
expression. Nevertheless, these strategies are only
applicable to a few antibodies.

The observations by Knappik & Plückthun (1995)
indicate that optimizing those parts of the antibody fragment
which are not directly involved in antigen recognition can
significantly improve folding properties and production yields
of recombinant Fv and scFv constructs. The causes for the
improved expression behavior lie in the decreased aggregation
behavior of these molecules. For other molecules, fragment
stability and protease resistance may also be affected. The
understanding of how specific sequence modifications change
these properties is still very limited and currently under
active investigation.

Difficulties in expressing and manipulating protein
domains may arise because amino acids which are normally
buried within the protein structure become exposed when only a
portion of the whole molecule is expressed. Aggregation may
occur through interaction of newly solvent-exposed hydrophobic
residues originally forming the contact regions between
adjacent domains. Leistler and Perham (1994) showed that
a certain domain of glutathione reductase may be expressed
separately from its neighboring domains, but the protein
showed non-specific association in vitro, forming multimeric
protein species. The introduction of hydrophilic residues
instead of exposed hydrophobic amino acids could decrease this
aggregation tendency and thus stabilize this isolated domain.
Both wild type and modified domains were exclusively found in
inclusion bodies and had to be refolded. Although in vitro
experiments have contributed much to defining the various
intermolecular interactions which drive folding processes, they
are only of limited value in predicting the folding behaviour of
different polypeptide chains in vivo (Gething & Sambrook, 1992).
Thus, Leistler and Perham do not teach or suggest how to increase
expression yields of soluble protein domains.

In the case of antibodies, two chains comprising
several domains dimerize, each domain consisting of a β-barrel
whose two β-sheets are held together by a disulphide bond,
forming the so-called immunoglobulin fold. Two domains, one
variable domain (VL) and one constant domain (CL), are adjacent
along the longitudinal axis in the light chain (VL-CL), and
four domains, one variable domain (VH) and three constant
domains (CH1 to CH3), are adjacent along the longitudinal axis
in the heavy chain (VH-CH1-CH2-CH3). In the dimer formed by
chains a and b, two such domains associate laterally: VLa with
VHa, CLa with CH1a, VLb with VHb, CLb with CH1b, CH2a with CH2b
and CH3a with CH3b. In WO 92/01787 (Johnson et al., 1992), it
is taught that isolated single domains, e.g. VH, can be
modified in the former VL/VH interface region by exchanging
hydrophobic residues for hydrophilic ones without changing the
specificity of the parent domain. The rationale for WO
92/01787 was the assumption that exposed hydrophobic residues
might lead to non-specific binding, interaction with surfaces
and decreased stability. Data for an increase in binding
specificity were given, but an increase in expression level was
not shown. Furthermore, WO 92/01787 would not be applicable to
any antibody fragment containing the complete antigen binding
site, as such a fragment must contain both VL and VH.

In the case of T cell receptors, two chains (α and β)
dimerize, each consisting of a variable (V) and a constant (C)
domain with the immunoglobulin fold, and one transmembrane
domain. In each chain, the variable and constant domains are
adjacent along the longitudinal axis in the chains (Vα-Cα;
Vβ-Cβ) and associate laterally with the corresponding domains
of the second chain (Vα-Vβ; Cα-Cβ).

Various other molecules of the immunoglobulin
superfamily, such as CD2, CD4, CD16, CD22, comprise only one
chain, wherein two or more domains (variable and/or constant)
with the immunoglobulin fold are adjacent along the
longitudinal axis in the chains.

Summary of the Invention

The present inventors have found that expression
problems are largely associated with a part of the molecule
that has hitherto not been regarded as relevant for expression
studies and which comprises the interface between adjacent
domains within an immunoglobulin chain. This surprising
finding forms the basis of the present invention, which
provides a general solution to the problems associated with
production of domains or fragments of the immunoglobulin
superfamily (IgSF), especially antibody fragments, which
exhibit poor solubility or reduced levels of expression.

In addition to the lateral interactions between domains
of different chains described above, there are well documented
contacts between adjacent domains within individual chains
along the longitudinal axis. For example, in the case of an
antibody (Lesk & Chothia, 1988), the "bottom" of VL makes
contact with the "top" of CL, and, in a similar manner, there
are contacts between VH and CH1. The contacts at these
inter-domain interfaces are probably essential for the compact
arrangement of the Fab fragment, and, as is typical for such
contacts, are at least partially hydrophobic in nature (Lesk &
Chothia, 1988).

The basis of the present invention is the surprising
finding that the solubility (and hence the yield) of antibody
fragments comprising at least one domain can be dramatically
increased by decreasing the hydrophobicity of former
interfaces at the "end" of said domain, where it would
normally adjoin a second domain within a chain in a larger
antibody fragment or full antibody. This is surprising and
could not have been predicted from the prior art (WO
92/01787), because the size of the longitudinal interface, for
example, in a scFv fragment, is much smaller than that between
VH and VL, and therefore, the amino acids which make up the
interfaces between VH and CH1 or between VL and CL in a Fab
fragment represent a much smaller proportion of the total
surface area of the scFv molecule, and would accordingly be
expected to play less of a role in determining the physical
properties of the molecule.

The present invention has the additional advantage
that, because the alterations effected in the molecules that
lead to said decreased hydrophobicity of former interfaces are
located at the most distant part of the domain from the CDRs,
applying the invention is unlikely to have a deleterious
effect on the binding properties of the molecule. This is not
the case in WO 92/01787, where at least one modification is
close to the CDRs and may therefore be expected to have an
impact on antigen binding. Furthermore, WO 92/01787 cannot be
applied to VL/VH heterodimers, as explained above.

Brief Description of the Drawings

Figure 1 provides a space filling representation of
the Fv fragment of the antibody 4-4-20.

Figure 2 presents the variable/constant domain
interface residues for VL (2a) and VH (2b). For 30
non-redundant Fab fragments taken from the Brookhaven
Databank, the solvent accessible surface of the amino acid
side chains was calculated in the context of an Fv and of an
Fab fragment. The plot shows the relative reduction in
accessible surface upon contact with the constant domains
(overlay plot for all 30 Fv fragments). In the sequence
alignment, residues contributing to the v/c interface are
highlighted. The symbols indicate the relative reduction of
solvent accessible surface upon removing the constant domains
(symbols: no symbol < 1%; 1 < 20%; n < 40%; s < 60%; t < 80%;
and u ≥ 80%). Circles indicate those positions which are
further analyzed (see Table 1).

Figure 3 presents Western blots showing the
insoluble (i) and soluble (s) fractions of cell extracts,
prepared as described in Material and Methods, expressing the
scFv fragments of the antibody 4-4-20. The amino acids
substituted in the various mutants are given in Table 2.

Figure 4 presents a Scatchard plot of the
fluorescence titration of fluorescein (20 nM) with antibody (4
to 800 nM), measured at 510 nm. The value r was obtained from
(F-Fo)/(F∞-Fo), where F is the measured fluorescein
fluorescence at a given antibody concentration, Fo is the
fluorescence in the absence of antibody and F∞ is the
fluorescence when antibody is present in large excess. Note
that r gives the saturation of fluorescein by antibody.
(a) Titration of wt scFv, (b) titration of Flu4 (V84D).

Figure 5 presents an overlay plot of the urea
denaturation curves ((x) wt scFv, (o) Flu4).

Figure 6 presents the thermal denaturation time
courses at 40 and 44°C for wt and Flu4 scFv fragment ((a) wt
scFv at 40°C, (b) Flu4 at 40°C, (c) Flu4 at 44°C, (d) wt scFv
at 44°C).

Table 1 describes the sequence variability of
residues contributing to the v/c interface. Residue
statistics are based on the variable domain sequences in the
Kabat database (March 1996). Sequences which were <90%
complete were excluded from the analysis. Number of sequences
analyzed: human VL kappa: 404 of 881, murine VL kappa: 1061 of
2239, human VL lambda: 223 of 409, murine VL lambda: 71 of
206, human VH: 663 of 1756, murine VH: 1294 of 3849. Position
refers to the sequence position according to Kabat et al.
(1991), %exp. (Fab) to the relative side chain accessibility in
an Fab fragment as calculated by the program NACCESS (NACCESS
v2.0 by Simon Hubbard), %exp. (ind.) to the relative side chain
accessibility in the isolated VL or VH domain, %buried to the
relative difference in side chain accessibility between Fv and
Fab fragment. Consensus refers to the sequence consensus, and
Distribution to the distribution of residue types.

Table 2 describes mutations introduced in the scFv 
fragment of the antibody 4-4-20. Each line represents a 
different protein carrying the mutations indicated. The 
residues are numbered according to Kabat et al. (1991). 

Table 3 describes KD values of the different scFv 
mutants determined by fluorescence titration. The KD values
are given in nM; the error was calculated from the Scatchard
analysis (Fig. 4). # determined by Miklasz et al. (1995).



Description of the Preferred Embodiments

The present invention relates to a modified
immunoglobulin superfamily (IgSF) domain or fragment which
differs from a parent IgSF domain or fragment in that the
region which comprised or would comprise the interface with a
second domain adjoined to said parent IgSF domain or fragment
within the protein chain of a larger IgSF fragment or a full
IgSF protein, and which is exposed in said parent IgSF domain
or fragment in the absence of said second domain, is made more
hydrophilic by modification.

In the context of the present invention, the term
immunoglobulin superfamily (IgSF) domain refers to those parts
of members of the immunoglobulin superfamily which are
characterized by the immunoglobulin fold, said superfamily
comprising the immunoglobulins or antibodies, and various
other proteins such as T-cell receptors or integrins. The
term IgSF fragment refers to any portion of a member of the
immunoglobulin superfamily, said portion comprising at least
one IgSF domain. The term adjoining domain refers to a domain
which is contiguous with a first domain. The term interface
refers to a region of said first domain where interaction with
the adjoining domain takes place. The terms hydrophobic and
hydrophilic refer to a physical property of amino acids, which
can be estimated quantitatively: tabulated values of
hydrophobicity for the twenty naturally-occurring amino acids
are available (Nozaki & Tanford, 1971; Casari & Sippl, 1992;
Rose & Wolfenden, 1993).

The residues to be modified can be identified in a
number of ways. For example, in one way, the solvent
accessibilities (Lee & Richards, 1971) of hydrophobic
interface residues in said parent IgSF fragment compared to
said larger IgSF fragment or full IgSF protein are calculated,
with high accessibilities indicating highly exposed residues.
In a second way, the number of van der Waals contacts of
hydrophobic interface residues in said larger IgSF fragment or
full IgSF protein is calculated. A large number for a residue
of said parent domain indicates that it will be highly
solvent-exposed in the absence of an adjoining domain. There
are other ways of calculating or determining residues to be
modified according to the present invention, and one of
ordinary skill in the art will be able to identify and
practice these ways.
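The first approach can be sketched in a few lines of Python. This is a minimal illustration only, not the procedure actually used: the input dictionaries, the residue labels, the set of residue types treated as hydrophobic and the 20-point threshold are all assumptions made for the example, and in practice the relative accessibilities would be taken from the output of a solvent-accessibility program.

```python
# Hypothetical sketch: flag hydrophobic residues that become exposed when the
# adjoining domain is removed. All names, values and thresholds are illustrative.

HYDROPHOBIC = {"ALA", "VAL", "LEU", "ILE", "MET", "PHE", "TRP", "PRO"}  # assumed set


def newly_exposed(acc_isolated, acc_full, residue_types, min_gain=20.0):
    """Return residues whose relative side-chain accessibility (in %) rises by
    at least `min_gain` points when the adjoining domain is absent."""
    hits = []
    for pos, exposed in acc_isolated.items():
        gain = exposed - acc_full.get(pos, exposed)
        if residue_types[pos] in HYDROPHOBIC and gain >= min_gain:
            hits.append((pos, residue_types[pos], round(gain, 1)))
    # Largest gain first: these are the most exposed former interface residues.
    return sorted(hits, key=lambda h: h[2], reverse=True)


# Made-up example values (percent relative accessibility), not measured data.
acc_fv = {"H11": 65.0, "H84": 80.0, "H89": 45.0, "H113": 70.0}
acc_fab = {"H11": 20.0, "H84": 30.0, "H89": 40.0, "H113": 65.0}
types = {"H11": "LEU", "H84": "VAL", "H89": "VAL", "H113": "SER"}

candidates = newly_exposed(acc_fv, acc_fab, types)
```

Sorting by the gain in accessibility puts the residues that lose the most cover at the top of the list, which is one plausible way of ranking candidates for modification.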

By analyzing computer models of said parent IgSF
fragment, interactions of said highly exposed residues within
the fragment can be identified. Such interactions could
stabilize the parent IgSF fragment. Residues which interact
closely with other hydrophobic residues, and which can be
identified by anyone of ordinary skill in the art, should
preferably not be mutated.

The modification referred to above may be effected
in a number of ways which are well known to one skilled in the
art. In a preferred embodiment, the modification is a
substitution of one or more amino acids at the exposed
interface, identified as described above, with amino acids
which are more hydrophilic. Alternatively, one or more amino
acids can be inserted in said interface, or one or more amino
acids can be deleted from said interface, so as to increase
its overall hydrophilicity. Furthermore, any combination of
substitution, insertion and deletion can be effected to reduce
the hydrophobicity of said interface. Also comprised by the
present invention is the possibility that the substitution or
insertion comprises amino acids with a relatively high
hydrophobicity value, or that the deletion comprises amino
acids with a relatively low hydrophobicity value, as long as the
overall hydrophilicity value is increased in said interface
region. Modifications such as substitution, insertion and
deletion can be effected using standard methods which are well
known to practitioners skilled in the art. By way of example,
the skilled artisan can use either site-directed or PCR-based
mutagenesis (Ho et al., 1989; Kunkel et al., 1991; Trower,
1994; Viville, 1994), or total gene synthesis (Prodromou &
Pearl, 1992) to effect the necessary modification(s). In a
further embodiment, the mutations may be obtained by random
mutagenesis and screening of random mutants, using a suitable
expression and screening system (see, for example, Stemmer,
1994; Crameri et al., 1996).
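The condition that only the overall hydrophilicity of the interface has to increase can be illustrated with a short calculation. The sketch below assumes the Kyte-Doolittle hydropathy scale, which is not one of the scales cited in this specification and is used here only because its values are widely tabulated; the interface residues and the substitutions are hypothetical.

```python
# Kyte-Doolittle hydropathy values (higher = more hydrophobic). Assumed scale,
# used for illustration in place of the scales cited in the text.
KD_SCALE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5, "Q": -3.5, "E": -3.5,
    "G": -0.4, "H": -3.2, "I": 4.5, "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8,
    "P": -1.6, "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def mean_hydropathy(residues):
    """Average hydropathy of a list of one-letter residue codes."""
    return sum(KD_SCALE[r] for r in residues) / len(residues)


def apply_substitutions(residues, substitutions):
    """Return a copy of `residues` with {index: new_residue} applied."""
    mutated = list(residues)
    for index, new_residue in substitutions.items():
        mutated[index] = new_residue
    return mutated


# Hypothetical interface residues and a hypothetical pair of substitutions.
parent = ["L", "V", "V", "S", "T"]
mutant = apply_substitutions(parent, {1: "D", 3: "A"})  # V->D, but also S->A

parent_score = mean_hydropathy(parent)
mutant_score = mean_hydropathy(mutant)
more_hydrophilic = mutant_score < parent_score
```

In this made-up case the S->A exchange makes one position more hydrophobic, yet the interface as a whole still becomes more hydrophilic because of the V->D exchange, which is the situation the paragraph above allows for.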

In a preferred embodiment, the amino acid(s) which
replace(s) the more hydrophobic amino acids include Asn, Asp,
Arg, Gln, Glu, Gly, His, Lys, Ser, and Thr. These are among
the more hydrophilic of the 20 naturally-occurring amino
acids, and have proven to be particularly effective in the
application of the present invention. Said amino acids, alone
or in combination, or in combination with other amino acids,
can also be used to form the above mentioned insertion which
makes the interface region more hydrophilic.

The parent IgSF domain or fragment referred to above
can be one of several different types. In a preferred
embodiment, said parent domain or fragment is derived from an
antibody. In one embodiment, said parent antibody fragment
comprises an Fv fragment. In this context, the term Fv
fragment refers to a complex comprising the VL (variable
light) and VH (variable heavy) portions of the antibody
molecule. In a further embodiment, the parent antibody
fragment may be a single-chain Fv fragment (scFv; Bird et al.,
1988; Huston et al., 1988), in which the VL and VH chains are
joined, in either a VL-VH or VH-VL orientation, by a peptide
linker. In yet a further embodiment, the parent antibody
fragment may be an Fv fragment stabilized by an inter-domain
disulphide bond. This is a structure which can be made by
engineering into each chain a single cysteine residue, wherein
said cysteine residues from two chains become linked through
oxidation to form a disulphide (Glockshuber et al., 1990;
Brinkmann et al., 1993).

In a most preferred embodiment, the interface region
of the variable domains mentioned above comprises residues 9,
10, 12, 15, 39, 40, 41, 80, 81, 83, 103, 105, 106, 106A, 107,
108 for VL, and residues 9, 10, 11, 13, 14, 41, 42, 43, 84,
87, 89, 105, 108, 110, 112, 113 for VH according to the Kabat
numbering system (Kabat et al., 1991). Said numbering system
was established for the sequences of whole antibodies, but can
be adapted correspondingly to describe the sequences of
isolated antibody domains or antibody fragments, even in the
case of scFv fragments, where VL and VH are connected via a
peptide linker, and where the protein sequence from N- to
C-terminus has to be numbered differently. This means that
the Kabat numbering system is used in the present invention as
a sequence description relative to the existing data on
antibody sequences, not as an absolute description of actual
positions within the antibody fragment sequences of interest.
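The listed positions lend themselves to a simple lookup. The following sketch stores them as strings, because Kabat positions may carry insertion letters such as 106A; the function name and its signature are illustrative assumptions. Because Kabat numbering is a relative description, the sequential residue index of an scFv would first have to be mapped to Kabat positions; that mapping is not shown here.

```python
# Interface positions as listed in the text (Kabat numbering). Stored as strings
# because Kabat positions may carry insertion letters such as "106A".
VL_INTERFACE = {"9", "10", "12", "15", "39", "40", "41", "80", "81", "83",
                "103", "105", "106", "106A", "107", "108"}
VH_INTERFACE = {"9", "10", "11", "13", "14", "41", "42", "43", "84", "87",
                "89", "105", "108", "110", "112", "113"}


def is_interface_position(chain, kabat_position):
    """Return True if the Kabat position belongs to the listed v/c interface.
    `chain` is "L" or "H"; the function name and signature are illustrative."""
    table = {"L": VL_INTERFACE, "H": VH_INTERFACE}
    return str(kabat_position).upper() in table[chain]


# Example queries: heavy chain position 84 and light chain position 106A.
h84_in_interface = is_interface_position("H", 84)
l106a_in_interface = is_interface_position("L", "106a")
```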

In a further embodiment, said parent antibody
fragment comprises a Fab fragment. In this context, the term
Fab refers to a complex comprising the VL-CL (variable and
constant light) and VH-CH1 (variable and first constant heavy)
portions of the antibody molecule, and the term interface
region refers to a region in the first constant domain of the
heavy chain (CH1) which is, or would be, adjoined to the CH2
domain in a larger antibody fragment or full antibody.

In a still further embodiment, said parent IgSF
fragment is a fusion protein of any of said domains or
fragments and another protein domain, derived from an antibody
or any other protein or peptide. The advent of bacterial
expression of antibody fragments has opened the way to the
construction of proteins comprising fusions between antibody
fragments and other molecules. A further embodiment of the
present invention relates to such fusion proteins by providing
for a DNA sequence which encodes both the modified IgSF domain
or fragment, as described above, as well as an additional
moiety. Particularly preferred are moieties which have a
useful therapeutic function. For example, the additional
moiety may be a toxin molecule which is able to kill cells
(Vitetta et al., 1993). There are numerous examples of such
toxins, well known to those skilled in the art, such as the
bacterial toxins Pseudomonas exotoxin A and diphtheria toxin,
as well as the plant toxins ricin, abrin, modeccin, saporin,
and gelonin. By fusing such a toxin to an antibody fragment,
the toxin can be targeted to, for example, diseased cells, and
thereby have a beneficial therapeutic effect. Alternatively,
the additional moiety may be a cytokine, such as IL-2
(Rosenberg & Lotze, 1986), which has a particular effect (in
this case a T-cell proliferative effect) on a family of cells.
In a further preferred embodiment, the additional moiety is at
least part of a surface protein which may direct the fusion
protein to the surface of an organism, for example, a cell or
a phage, and thereby displays the IgSF partner. Preferably,
the additional moiety is at least part of a coat protein of
filamentous bacteriophages, most preferably of the geneIII
protein. In a further embodiment, the additional moiety may
confer on its IgSF partner a means of detection and/or
purification. For example, the fusion protein could comprise
the modified IgSF domain or fragment and an enzyme commonly
used for detection purposes, such as alkaline phosphatase
(Blake et al., 1984). There are numerous other moieties which
can be used as detection or purification tags, which are well
known to the practitioner skilled in the art. Particularly
preferred are peptides comprising at least five histidine
residues (Hochuli et al., 1988), which are able to bind to
metal ions, and can therefore be used for the purification of
the protein to which they are fused (Lindner et al., 1992).
Also provided for by the invention are additional moieties
such as the commonly used c-myc and FLAG tags (Hopp et al.,
1988; Knappik & Plückthun, 1994).

By engineering one or more fused additional domains,
IgSF domains or fragments can be assembled into larger
molecules which also fall under the scope of the present
invention. To the extent that the physical properties of the
IgSF domain or fragment determine the characteristics of the
assembly, the present invention provides a means of increasing
the solubility of such larger molecules. For example,
mini-antibodies (Pack, 1994) are dimers comprising two
antibody fragments, each fused to a self-associating
dimerization domain. Dimerization domains which are
particularly preferred include those derived from a leucine
zipper (Pack & Plückthun, 1992) or helix-turn-helix motif
(Pack et al., 1993).

All of the above embodiments of the present
invention can be effected using standard techniques of
molecular biology known to anyone skilled in the art.

The compositions described above may have utility in
any one of a number of settings. Particularly preferred are
diagnostic and therapeutic compositions.

The present invention also provides methods for
making the compositions described above and the compounds
comprised therein. Particularly preferred is a method
comprising the following steps:

i) analyzing the interface region of an IgSF domain for
hydrophobic residues which are solvent-exposed using either a
solvent-accessibility approach (Lee & Richards, 1971),
analysis of van der Waals interactions in the interface
region, or similar methods which are well known to one skilled
in the art,

ii) identifying one or more of the hydrophobic residues to be
substituted by more hydrophilic residues, or one or more
positions where hydrophilic residues or amino acid stretches
enhancing the overall hydrophilicity of the interface region
can be inserted into said interface region, or one or more
positions where hydrophobic residues or amino acid stretches
enhancing the overall hydrophobicity of the interface region
can be deleted from said interface region, or any combination
of said substitutions, said insertions, and said deletions to
give one or more mutants of said parent IgSF domain,

iii) preparing DNA encoding mutants of said IgSF domain,
characterized by the changes identified in ii), by using e.g.
conventional mutagenesis or gene synthesis methods, said DNA
being prepared either separately or as a mixture,

iv) introducing said DNA or DNA mixture into a vector system
suitable for expression of said mutants,

v) introducing said vector system into suitable host cells
and expressing said mutant or mixture of mutants,

vi) identifying and characterizing mutants which are obtained
in higher yield in soluble form, and

vii) if necessary, repeating steps iii) to vi) to increase the
hydrophilicity of said identified mutant or mutants further.

The host referred to above may be any of a number
commonly used in the production of heterologous proteins,
including but not limited to bacteria, such as E. coli (Ge et
al., 1995), or Bacillus subtilis (Wu et al., 1993), fungi, such
as yeasts (Horwitz et al., 1988; Ridder et al., 1995) or
filamentous fungi (Nyyssonen et al., 1993), plant cells
(Hiatt, 1990; Hiatt & Ma, 1993; Whitelam et al., 1994), insect
cells (Potter et al., 1993; Ward et al., 1995), or mammalian
cells (Trill et al., 1995).

The invention also relates to a method for the
production of an IgSF domain or fragment of the invention
comprising culturing a host cell of the invention and
isolating said domain or fragment.

The invention is now demonstrated by the following
examples, which are presented for illustration only and are
not intended to limit the scope of the invention.



Examples 

i) Abbreviations 

Abbreviations are defined as follows: CDR:
complementarity determining region; dsFv: disulfide-linked Fv
fragment; IMAC: immobilized metal ion affinity chromatography;
IPTG: isopropyl-β-D-thiogalactopyranoside; i/s: ratio
insoluble/soluble; H(X): heavy chain residue number X; L(X):
light chain residue number X; NTA: nitrilo-triacetic acid;
OD550: optical density at 550 nm; PDB: protein database; scFv:
single-chain Fv fragment; SDS-PAGE: sodium dodecyl sulfate
polyacrylamide gel electrophoresis; v/c: variable/constant;
wt: wild type.



ii) Material and Methods 



(a) Calculation of solvent accessibility 

Solvent accessible surface areas for 30
non-redundant Fab fragments and the Fv fragments derived from
these by deleting the constant domain coordinates from the PDB
file were calculated using the latest version, as of March
1996, of the program NACCESS
(http://www.biochem.ucl.ac.uk/-roman/naccess/naccess), based on
the algorithm described by Lee & Richards (1971).



(b) scFv gene synthesis 



The single-chain Fv fragment (scFv) in the
orientation VL-linker-VH of the antibody 4-4-20 (Bedzyk et
al., 1990) was obtained by gene synthesis (Prodromou and
Pearl, 1992). The VL domain carries a three-amino acid long
FLAG tag (Knappik and Plückthun, 1994). We have used two
different linkers with a length of 15 (Gly4Ser)3 and 30 amino
acids (Gly4Ser)6, respectively. The gene so obtained was
cloned into a derivative of the vector pIG6 (Ge et al., 1995).
The mutant antibody fragments were constructed by
site-directed mutagenesis (Kunkel et al., 1987) using
single-stranded DNA and up to three oligonucleotides per
reaction.

(c) Expression 

Growth curves were obtained as follows: 20 ml of
2xYT medium containing 100 µg/ml ampicillin and 25 µg/ml
streptomycin were inoculated with 250 µl of an overnight
culture of E. coli JM83 harboring the plasmid encoding the
respective antibody fragment and incubated at 24.5°C until an
OD550 of 0.5 was reached. IPTG (Biomol Feinchemikalien GmbH)
was added to a final concentration of 1 mM and incubation was
continued for 3 hours. The OD was measured every hour, as was
the β-lactamase activity in the culture supernatant, to
quantify the degree of cell leakiness. Three hours after
induction, an aliquot of the culture was removed and the cells
were lysed exactly as described by Knappik and Plückthun
(1995). The β-lactamase activity was measured in the
supernatant, in the insoluble and in the soluble fraction.
The fractions were assayed for antibody fragments by reducing
SDS-PAGE, with the samples normalized to OD and β-lactamase
activity to account for possible plasmid loss as well as for
cell leakiness. The gels were blotted and immunostained using
the FLAG antibody M1 (Prickett et al., 1989) as the first
antibody and an Fc-specific anti-mouse antiserum conjugated to
horseradish peroxidase (Pierce) as the second antibody, using
a chemiluminescent detection assay described elsewhere (Ge et
al., 1995).

(d) Purification 

Mutant scFv fragments were purified by a two-column
procedure. After French press lysis of the cells, the raw E.
coli extract was first purified by IMAC (Ni-NTA superflow,
Qiagen) (20 mM HEPES, 500 mM NaCl, pH 6.9; step gradient of
imidazole 10, 50 and 200 mM) (Lindner et al., 1992) and, after
dialyzing the IMAC eluate against 20 mM MES pH 6.0, finally
purified by cation exchange chromatography (S-Sepharose fast
flow column, Pharmacia) (20 mM MES, pH 6.0; salt gradient
0-500 mM NaCl). Purity was checked by Coomassie stained
SDS-PAGE. The functionality of the scFv was tested by
competition ELISA.

Because of its very poor solubility in the
periplasmic system, the wt 4-4-20 was expressed as cytoplasmic
inclusion bodies in the T7-based system (Studier & Moffatt,
1986; Ge et al., 1995). The refolding procedure was carried
out as described elsewhere (Ge et al., 1995). For
purification, the refolding solution (2 l) was loaded over 10
h without prior dialysis onto a fluorescein affinity column,
followed by a washing step with 20 mM HEPES, 150 mM NaCl, pH
7.5. Two column volumes of 1 mM fluorescein (sodium salt,
Sigma Chemicals Co.) pH 7.5 were used to elute all functional
scFv fragment. Extensive dialysis (7 days with 12 buffer
changes) was necessary to remove all fluorescein. All
purified scFv fragments were tested in gel filtration
(Superose-12 column, Pharmacia SMART-System, 20 mM HEPES, 150
mM NaCl, pH 7.5).

(e) KD determination by fluorescence titration 

The concentrations of the proteins were determined 
photometrically using an extinction coefficient calculated 
according to Gill and von Hippel (1989). Fluorescence 
titration experiments were carried out by taking advantage of 
the intense fluorescence of fluorescein. Two ml of 20 mM 
HEPES, 150 mM NaCl, pH 7.5 containing 10 or 20 nM fluorescein 
were placed in a cuvette with integrated stirrer. The 
excitation wavelength was 485 nm, and emission spectra were 
recorded from 490 to 530 nm. Purified scFv (in 20 mM HEPES, 
150 mM NaCl, pH 7.5) was added in 5 to 100 µl aliquots, and 
after a 3 min equilibration time a spectrum was recorded. All 
spectra were recorded at 20°C. The emission maximum at 510 
nm was used for determining the degree of complexation of scFv 
with fluorescein, seen as quenching as a function of the 
concentration of the antibody fragment. The KD value was 
determined by Scatchard analysis. 
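The Scatchard step can be illustrated with a minimal sketch. It assumes 1:1 binding and assumes that the measured quenching has already been normalized to the maximal quenching at saturation, so that it equals the fraction of fluorescein bound. The function names, the synthetic titration points and the KD of 1 nM are hypothetical illustration values, not data from this study.

```python
import math


def complex_conc(total_scfv, total_lig, kd):
    """Exact 1:1 complex concentration (all arguments in the same units)."""
    s = total_scfv + total_lig + kd
    return (s - math.sqrt(s * s - 4.0 * total_scfv * total_lig)) / 2.0


def scatchard_kd(total_scfv, frac_quench, total_lig):
    """Estimate KD from the slope of bound/free scFv plotted against bound.

    For 1:1 binding, bound/free = (total_lig - bound) / KD, so the
    least-squares slope of the plot is -1/KD.
    """
    xs, ys = [], []
    for scfv, q in zip(total_scfv, frac_quench):
        bound = q * total_lig          # complex concentration
        free = scfv - bound            # free scFv concentration
        if bound <= 0.0 or free <= 0.0:
            continue                   # skip points that cannot be plotted
        xs.append(bound)
        ys.append(bound / free)
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return -1.0 / slope


# Synthetic titration: 10 nM fluorescein (as in the protocol above) and a
# hypothetical KD of 1 nM chosen only to exercise the calculation.
TOTAL_FLUORESCEIN = 10.0   # nM
ASSUMED_KD = 1.0           # nM, hypothetical
scfv_totals = [2.0, 4.0, 6.0, 8.0, 10.0, 15.0, 20.0, 30.0]   # nM
quench = [complex_conc(a, TOTAL_FLUORESCEIN, ASSUMED_KD) / TOTAL_FLUORESCEIN
          for a in scfv_totals]

kd_estimate = scatchard_kd(scfv_totals, quench, TOTAL_FLUORESCEIN)
print(f"KD recovered from the Scatchard slope: {kd_estimate:.3f} nM")
```

With real data the plot is sensitive to errors at low and high saturation, so a direct nonlinear fit of the binding isotherm may be a useful cross-check.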

(f) Equilibrium denaturation measurement 

Equilibrium denaturation curves were obtained by 
denaturation of 0.2 µM protein in HEPES buffered saline (HBS) 
buffer (20 mM HEPES, 150 mM NaCl, 1 mM EDTA, pH 7.5) and 
increasing amounts of urea (1.0-7.5 M; 20 mM HEPES, 150 mM 
NaCl, pH 7.4; 0.25 M steps) in a total volume of 1.7 ml. 
After incubating the samples at 10°C for 12 hours and an 
additional 3 hours at 20°C prior to measurements, the 
fluorescence spectra were recorded at 20°C from 320-360 nm 
with an excitation wavelength of 280 nm. The emission 
wavelength of the fluorescence peak shifted from 341 to 347 nm 
during denaturation and was used for determining the fraction 
of unfolded molecules. Curves were fitted according to Pace 
(1990). 
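The evaluation of such a curve can be sketched as follows, assuming a two-state transition, flat baselines at 341 nm (native) and 347 nm (unfolded), and a linear dependence of the free energy of unfolding on urea concentration (linear extrapolation). The function names and the m-value of 8 kJ/(mol M) are hypothetical; only the 341-347 nm shift, the 0.25 M urea steps and the 20°C measurement temperature are taken from the text.

```python
import math

R_KJ = 8.314e-3  # gas constant in kJ/(mol K)


def fraction_unfolded(lam_max, lam_native=341.0, lam_unfolded=347.0):
    """Fraction unfolded from the emission maximum, assuming flat baselines."""
    return (lam_max - lam_native) / (lam_unfolded - lam_native)


def linear_extrapolation(urea, lam_max, temp_k=293.15):
    """Return (dG_H2O, m, midpoint) from points inside the transition region.

    dG = -RT ln(fu / (1 - fu)) is fitted by least squares to dG_H2O - m * [urea].
    """
    xs, ys = [], []
    for conc, lam in zip(urea, lam_max):
        fu = fraction_unfolded(lam)
        if 0.1 < fu < 0.9:             # use only well-determined points
            xs.append(conc)
            ys.append(-R_KJ * temp_k * math.log(fu / (1.0 - fu)))
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    dg_h2o = my - slope * mx
    m_value = -slope
    return dg_h2o, m_value, dg_h2o / m_value


# Synthetic curve: urea from 1.0 to 7.5 M in 0.25 M steps, with a
# hypothetical m-value and a midpoint chosen only for illustration.
TRUE_M = 8.0      # kJ/(mol M), hypothetical
TRUE_CM = 4.1     # M urea
TEMP_K = 293.15   # 20 degrees C
urea_conc = [1.0 + 0.25 * i for i in range(27)]
lam_obs = []
for c in urea_conc:
    dg = TRUE_M * (TRUE_CM - c)
    fu = 1.0 / (1.0 + math.exp(dg / (R_KJ * TEMP_K)))
    lam_obs.append(341.0 + 6.0 * fu)

dg_h2o, m_value, midpoint = linear_extrapolation(urea_conc, lam_obs, TEMP_K)
print(f"dG(H2O) = {dg_h2o:.1f} kJ/mol, m = {m_value:.1f} kJ/(mol M), "
      f"midpoint = {midpoint:.2f} M urea")
```

Treating the shift of the emission maximum as linear in the fraction unfolded is itself an approximation, and a full fit with sloping baselines would replace the fixed 341 and 347 nm values with fitted parameters.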

(g) Thermal denaturation 

For measuring the thermal denaturation rates, 
purified scFv was dissolved in 2 ml HBS buffer to a final 
concentration of 0.5 µM. The aggregation was followed for 2.5 
h at 40°C and at 44°C by light scattering at 400 nm. 

iii) Results 

(a) Comparison of known antibody sequences 



Compared to other domain/domain interfaces in 
proteins, the interface between immunoglobulin variable and 
constant domains is not very tightly packed. A comparison of 
30 non-redundant Fab structures in the PDB database showed 
that between the light chain variable and constant domain an 
area of 410 ± 90 Å² per domain is buried, while the heavy 
chain variable and constant domains interact over an area of 
710 ± 180 Å². Some, but not all, of the interface residues are 
hydrophobic, predominantly aliphatic. Generally, sequence 
conservation of the residues contributing to the v/c domain 
interface is not particularly high. Still, the v/c domain 
interface shows up as a marked hydrophobic patch on the 
surface of an Fv fragment (Fig. 1). 

Solvent accessible surface areas for 30 
non-redundant Fab fragments and their corresponding Fv 
fragments (derived from the Fab fragment by deleting the 
constant domain coordinates from the PDB file) were calculated 
using the program NACCESS (Lee & Richards, 1971). Residues 
participating in the v/c domain interface were identified by 
comparing the solvent-accessible surface area of each amino 
acid side chain in the context of an Fv fragment to its 
accessible surface in the context of an Fab fragment. Figure 
2 shows a plot of the relative change in side chain 
accessibility upon deletion of the constant domains as a 
function of sequence position. Residues which show a 
significant reduction of side chain accessibility are also 
highlighted in the sequence alignment. To assess sequence 
variability in the positions identified in Figure 2, the 
variable domain sequences collected in the Kabat database 
(status March 1996) were analyzed (Table 1). Of the 15 
interface residues identified in the VL domain of the antibody 
4-4-20 (Fig. 1 and Table 1), L9(leu), L12(pro), L15(leu), 
L40(pro), L83(leu), and L106(ile) are hydrophobic and 
therefore candidates for replacement. Of the 16 interface 
residues in the VH domain, H11(leu), H14(pro), H41(pro), 
H84(val), H87(met) and H89(ile) were identified as possible 
candidates for substitution by hydrophilic residues in the 
scFv fragment of the antibody 4-4-20 (Fig. 1 and Table 1). 
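The comparison of side chain accessibility in the Fv and Fab contexts can be sketched as below. This is only an illustration: the normalization of the change to the Fv value, the 20% cut-off, the function name and all surface-area numbers are assumptions made for the example, and the per-residue areas would in practice be read from the output of the surface-area program.

```python
def interface_residues(asa_fv, asa_fab, min_rel_change=0.2):
    """Return residues whose side chain becomes buried in the Fab context.

    asa_fv / asa_fab map a residue label (e.g. "L15") to the side chain
    accessible surface area in the Fv and Fab context (same units).
    The relative change is (Fv - Fab) / Fv, an assumed normalization.
    """
    hits = {}
    for residue, fv_area in asa_fv.items():
        fab_area = asa_fab.get(residue)
        if fab_area is None or fv_area <= 0.0:
            continue                   # residue missing or fully buried in Fv
        rel_change = (fv_area - fab_area) / fv_area
        if rel_change >= min_rel_change:
            hits[residue] = rel_change
    return hits


# Hypothetical areas, invented only to show the calculation.
example_fv = {"L15": 60.0, "L50": 80.0, "L83": 40.0, "H84": 55.0}
example_fab = {"L15": 20.0, "L50": 80.0, "L83": 10.0, "H84": 11.0}

example_hits = interface_residues(example_fv, example_fab)
for residue, change in sorted(example_hits.items()):
    print(f"{residue}: {change:.0%} of the Fv side chain area is buried in the Fab")
```

A residue such as the invented "L50" entry, whose accessibility is the same in both contexts, is not reported because it does not contact the constant domain.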

Not all of these hydrophobic residues are equally 
good candidates for replacements, however. While residues 
which are hydrophobic in one particular sequence but 
hydrophilic in many other sequences may appear most 
attractive, the conserved hydrophobic residues listed in Table 
1 have also been investigated, since the evolutionary pressure 
which kept these conserved residues acted on the Fab fragment 
within the whole antibody, but not the isolated Fv portion. 
In this study, we did not replace the proline residues since 
pro L40 and pro H41 form the hairpin turns at the bottom of 
the framework II region, while the conserved VL cis-proline L8 
and proline residues H9 and H14 determine the shape of 
framework I of the immunoglobulin variable domains. 

Excluding prolines, this leaves residues L9 (leu in 
4-4-20, ser in most kappa chains), L15 (leu, usually 
hydrophobic), L83 (leu, usually val or phe) and L106 (ile, as 
in 86% of all kappa chains) in the VL domain and H11 (leu as 
in 60% of all heavy chains), H84 (val, in other VH domains 
frequently ala or ser), H87 (met, usually ser) and H89 (ile, 
most frequently val) in VH as possible candidates for 
replacement in the 4-4-20 scFv fragment. 

(b) Mutations in the 4-4-20 scFv 

For the 4-4-20 scFv fragment some of the crucial 
residues identified in the sequence analysis described above 
are already hydrophilic, but nevertheless 9 residues are of 
hydrophobic nature (including pro12 in the light chain) (Table 
1). We chose three residues for closer analysis by mutations. 

Leu15 in VL is a hydrophobic amino acid in 98% of 
all kappa chains (Table 1). Leu11 is conserved in VH (Table 
1) and is involved in v/c interdomain contacts (Lesk & 
Chothia, 1988). In contrast, valine occurs very infrequently 
at position H84; mainly found at this position are threonine, 
serine or alanine (Table 1). As can be seen in Figure 1, 
val84 contributes to a large hydrophobic patch at the 
newly exposed surface of VH. All three positions were mutated 
into acidic residues, and L11 was also changed to asparagine 
(Table 2). 

The scFv fragment was tested and expressed with two 
different linkers, the 15-mer linker (Gly4Ser)3 (Huston et 
al., 1995) and the same motif extended to 30 amino acids, 
(Gly4Ser)6. All mutations were tested in both constructs. 
The in vivo effects of the different mutations on solubility 
were identical, and therefore only the results of the 30-mer 
linker are described in more detail. The periplasmic 
expression experiments were carried out at 24.5°C, and all 
constructs were tested for soluble and insoluble protein by 
immunoblotting. The ratio of insoluble to soluble (i/s) 
protein was determined for every mutant. In Figure 3 A-D, 
insoluble (lane 1) and soluble (lane 2) fractions of the wt 
scFv are shown. Nearly no soluble material occurs in 
periplasmic expression, which is consistent with previous 
reports of Bedzyk et al. (1990) and Denzin et al. (1991), who 
described that the periplasmic expression of the wt 
scFv leads mainly to periplasmic inclusion bodies. 

The single point mutation L15E in VL (Flu1) shows no 
effect on the ratio i/s when compared with the wt (Fig. 3A, 
lane 3, 4). Mutating leu at position 11 in the heavy chain to 
asparagine (Flu2) also shows nearly no effect compared to the 
wt, whereas the substitution with aspartic acid (Flu3) shifts 
the i/s ratio towards more soluble protein, although this effect 
is not very dramatic. In contrast, the point mutation at 
position 84 (Flu4, Fig. 3B, lane 3, 4 and Fig. 3D, lane 3, 4) 
had a dramatic influence on the solubility of the scFv 
fragment of the antibody 4-4-20. The ratio i/s is changed to 
about 1:1, resulting in a 25-fold increase of soluble protein 
compared to the wt. 
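The relation between the i/s ratio and the fold increase can be made explicit with a short calculation. It assumes that the total amount of expressed scFv (insoluble plus soluble) is the same for the wt and the mutant, which the text does not state; the function names are hypothetical.

```python
def soluble_fraction(insoluble, soluble):
    """Fraction of the expressed protein found in the soluble fraction."""
    return soluble / (insoluble + soluble)


def implied_reference_fraction(mutant_fraction, fold_increase):
    """Soluble fraction of the reference implied by a given fold increase,
    assuming equal total expression of reference and mutant."""
    return mutant_fraction / fold_increase


# Flu4 (V84D): i/s of about 1:1, reported as a 25-fold increase over the wt.
flu4_fraction = soluble_fraction(1.0, 1.0)
wt_fraction = implied_reference_fraction(flu4_fraction, 25.0)
wt_ratio = (1.0 - wt_fraction) / wt_fraction   # implied i/s of the wt

print(f"Flu4 soluble fraction: {flu4_fraction:.0%}")
print(f"Implied wt soluble fraction: {wt_fraction:.0%} (i/s about {wt_ratio:.0f}:1)")
```

Under this assumption the wt would carry only about 2% of its scFv in the soluble fraction, which agrees with the observation that nearly no soluble wt material is detected.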

The combination of V84D with L11N or L11D (Flu5, 
Flu6) also changes the ratio i/s compared to the wt, but this 
ratio is not improved further compared to V84D alone (Fig. 
3B). Interestingly, the combination of Flu5 with the light 
chain mutation at position 15 (Flu9) leads to less soluble 
material (Fig. 3C, lane 7, 8) than Flu5 itself (Fig. 3B, lane 5, 
6). The negative influence of the L15E mutation can also be 
seen in Flu8 (Fig. 3C, lane 5, 6) compared with Flu3 (Fig. 3A, 
lane 7, 8). In Fig. 3D the comparison of the wt (lane 1, 2 
and 5, 6) and Flu4 (lane 3, 4 and 7, 8) is shown in both the 
15-mer and the 30-mer construct. 

The negative effect of L15E can be rationalized by 
looking at a model of the 4-4-20 scFv fragment. L15 
forms a hydrophobic pocket together with residues A80, L83, 
and L106. Apparently, L15 stabilizes the scFv fragment by 
hydrophobic interactions with its closest neighbours. Thus 
the exchange L15E, intended to make the scFv fragment more 
hydrophilic and more soluble, is made at the expense of 
fragment stability. The analysis of hydrophobic interactions 
within a fragment should therefore be used to choose the 
solvent-exposed residues to be mutated in the case of any 
other antibody fragment. 

Combinations of various serine mutations in VH led to further 
improvements in the i/s ratio. The mutants FH15 (V84S, M87S, 
I89S) and FH20 (L11S, V84S, M87S, I89S) both showed more than 
70% of soluble protein in immunoblots (data not shown). 


(c) Functional expression and purification 

The oligomerization of scFv fragments as a function 
of linker length has been investigated previously. A 
continuous decrease in the amount of dimer and multimer 
formation as a function of linker length has been reported 
(Desplancq et al., 1994; Whitlow et al., 1994). While the 
(Gly4Ser)3 linker has been shown to lead to monomeric scFvs in 
many cases in the VH-VL direction, this is often not the case 
in the VL-VH direction. This is caused by an asymmetry in the 
VL/VH arrangement, leading to a longer distance between the 
end of VH and the N-terminus of VL than between C-terminus of 
VL and N-terminus of VH (Huston et al., 1995). Consequently, 
a linker of identical length may lead to different properties 
of the resulting molecules. 

Since we have chosen to use the minimal perturbation 
FLAG (Knappik & Pluckthun, 1994) at the N-terminus of VL in 
our constructs and thus the VL-linker-VH orientation, we have 
investigated the use of longer linkers. In the periplasmic 
expression in E. coli no difference between the 15-mer and the 
30-mer linker in the corresponding mutants is visible (Fig. 
3D), but when we attempted to purify the two Flu4 scFvs with 
long and short linker, a big discrepancy between the two 
constructs was found. The purification of the Flu4 mutant 
(V84D) with the 15-mer linker leads to very small amounts of 
partially purified protein (about 0.015 mg per liter and OD; 
estimated from SDS-PAGE after IMAC purification), whereas the 
30-mer linker construct gives about 0.3 mg per liter and OD of 
highly pure functional protein. All mutants with 30-mer 
linker were tested in gel filtration and found to be monomeric 
(data not shown). 

For further in vitro characterization five mutants 
were purified with the 30-mer linker: V84D (Flu4), V84D/L11D 
(Flu6), L11D (Flu3), and the serine mutants FH15 and FH20 (see 
iii(b)). A two-step chromatography, first using IMAC and then 
cation-exchange chromatography, led to homogeneous protein. 
The i/s ratio of the antibody fragments (Fig. 3) was also 
reflected in the purification yield of functional protein. 
The highly soluble mutant Flu4 (V84D) (Fig. 3B lane 3, 4) 
yielded about 0.3 mg purified and functional protein per liter 
and OD, Flu6 (L11D/V84D) (Fig. 3B lane 7, 8) yielded about 
0.25 mg per liter and OD and Flu3 (less soluble material on 
the blot in Fig. 3A lane 7, 8) yielded 0.05 mg per liter and 
OD. The serine mutants FH15 and FH20 yielded 0.3 mg and 0.4 
mg per liter and OD, respectively. The wt scFv of the 
antibody 4-4-20 did not give any soluble protein at all in 
periplasmic expression with either linker, and it was 
therefore expressed as cytoplasmic inclusion bodies, followed 
by refolding in vitro and fluorescein affinity chromatography. 
The refolded wt scFv was shown by gel filtration to be 
monomeric with the 30-mer linker (data not shown). 

(d) Biophysical properties of the mutant scFvs 

Since we changed amino acids which are conserved, it 
cannot be excluded that changes at these positions may be 
transmitted through the structure and have an effect on the 
binding constant, even though they are very far from the 
binding site (Chatellier et al., 1996). To eliminate this 
possibility, we determined the binding constant of the mutants 
Flu3, Flu4, Flu6 and the wt scFv. Fluorescence titration was 
used to determine KD in solution by using the quenching of the 
intrinsic fluorescence of fluorescein when it binds to the 
antibody. The fluorescence quenching at 510 nm was measured 
as a function of added scFv. The KD values (Table 3 and Fig. 
4) obtained for all three mutant scFvs and the wt scFv are 
very similar and correspond very well to the recently 
corrected KD of the monoclonal antibody 4-4-20 (Miklasz et 
al., 1995). 

To determine whether the mutations had an influence 
on the thermodynamic stability of the protein we determined 
the equilibrium unfolding curves by urea denaturation. The V84D 
mutant and the wt scFv were used for this analysis, and in 
Figure 5 an overlay plot is shown. The midpoint of both 
curves is at 4.1 M urea. Both curves were fitted by an 
algorithm for a two-state model described by Pace (1990), but 
the apparent small difference between the V84D mutant and the 
wt scFv is not of statistical significance. Aggregation of 
folding intermediates could be another explanation for the 
different in vivo results between the mutant scFvs and the wt 
scFv (Fig. 3). In the periplasm of E. coli, the protein 
concentrations are assumed to be rather high (van Wielink & 
Duine, 1990) and the aggregation effects could thus be 
pronounced. In order to estimate the aggregation behavior in 
vitro, we have measured the thermal aggregation rates at 
different temperatures. In Figure 6 it is clearly seen that 
the wt scFv already aggregates significantly at 44°C, 
whereas the mutant V84D tends to aggregate more slowly. The 
wt scFv is thus clearly more aggregation prone than the mutant 
scFv. This is very similar to the observations made with 
different mutations on the antibody McPC603 (Knappik and 
Pluckthun, 1995), where no correlation was found between 
equilibrium denaturation curves and expression behavior, but a 
good correlation was found with the thermal aggregation rates. 